Dataset statistics
| Number of variables | 24 |
|---|---|
| Number of observations | 3079 |
| Missing cells | 4385 |
| Missing cells (%) | 5.9% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 577.4 KiB |
| Average record size in memory | 192.0 B |
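The derived figures in the table above follow from the raw counts. A minimal sketch of that arithmetic, using only values copied from the table (the variable names are illustrative):

```python
# Values copied from the Dataset statistics table.
n_variables = 24
n_observations = 3079
missing_cells = 4385
total_size_kib = 577.4

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")               # Missing cells: 5.9%
print(f"Average record size: {avg_record_bytes:.1f} B")   # Average record size: 192.0 B
```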
Variable types
| CAT | 11 |
|---|---|
| NUM | 10 |
| BOOL | 3 |
Reproduction
| Analysis started | 2022-11-09 15:22:49.125275 |
|---|---|
| Analysis finished | 2022-11-09 15:23:20.384157 |
| Duration | 31.26 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
BARCODE2
Categorical
| Distinct | 3079 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 24.1 KiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| AC000003 | 1 | < 0.1% | |
| AC002138 | 1 | < 0.1% | |
| AC002276 | 1 | < 0.1% | |
| AC002277 | 1 | < 0.1% | |
| AC002278 | 1 | < 0.1% | |
| AC002279 | 1 | < 0.1% | |
| AC002280 | 1 | < 0.1% | |
| AC002281 | 1 | < 0.1% | |
| AC002282 | 1 | < 0.1% | |
| AC002284 | 1 | < 0.1% | |
| Other values (3069) | 3069 | 99.7% |
Unique
| Unique | 3079 |
|---|---|
| Unique (%) | 100.0% |
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 8 |
| Min length | 8 |
BARCODE2_V2
Categorical
| Distinct | 459 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 2620 |
| Missing (%) | 85.1% |
| Memory size | 24.1 KiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| AL100611 | 1 | < 0.1% | |
| AL100971 | 1 | < 0.1% | |
| AL100523 | 1 | < 0.1% | |
| AL100989 | 1 | < 0.1% | |
| AL100808 | 1 | < 0.1% | |
| AL100956 | 1 | < 0.1% | |
| AL100770 | 1 | < 0.1% | |
| AL100502 | 1 | < 0.1% | |
| AL100906 | 1 | < 0.1% | |
| AL100236 | 1 | < 0.1% | |
| Other values (449) | 449 | 14.6% | |
| (Missing) | 2620 | 85.1% |
Unique
| Unique | 459 |
|---|---|
| Unique (%) | 100.0% |
Length
| Max length | 8 |
|---|---|
| Median length | 3 |
| Mean length | 3.745371874 |
| Min length | 3 |
SITE_ID
Real number (ℝ≥0)
| Distinct | 10 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 64.31276388 |
|---|---|
| Minimum | 1 |
| Maximum | 86 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 24.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 78 |
| median | 79 |
| Q3 | 82 |
| 95-th percentile | 84 |
| Maximum | 86 |
| Range | 85 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 31.68955676 |
|---|---|
| Coefficient of variation (CV) | 0.4927413293 |
| Kurtosis | 0.2418446723 |
| Mean | 64.31276388 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | -1.489887519 |
| Sum | 198019 |
| Variance | 1004.228008 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 78 | 746 | 24.2% | |
| 1 | 615 | 20.0% | |
| 82 | 600 | 19.5% | |
| 79 | 537 | 17.4% | |
| 80 | 191 | 6.2% | |
| 81 | 183 | 5.9% | |
| 85 | 63 | 2.0% | |
| 84 | 57 | 1.9% | |
| 83 | 45 | 1.5% | |
| 86 | 42 | 1.4% |
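The summary statistics reported for SITE_ID can be reproduced from the frequency table above. The sketch below is an arithmetic check only, since SITE_ID is an identifier and its mean has no substantive meaning. It assumes the variance is the sample variance (divisor n - 1); the names are illustrative.

```python
import math

# (value: count) pairs copied from the SITE_ID frequency table.
site_counts = {78: 746, 1: 615, 82: 600, 79: 537, 80: 191,
               81: 183, 85: 63, 84: 57, 83: 45, 86: 42}

n = sum(site_counts.values())                       # number of observations
total = sum(v * c for v, c in site_counts.items())  # the "Sum" statistic
mean = total / n

# Sample variance with divisor n - 1 (an assumption about the report's convention).
variance = sum(c * (v - mean) ** 2 for v, c in site_counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(n, total, round(mean, 4), round(variance, 3), round(std, 4), round(cv, 4))
```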
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 615 | 20.0% | |
| 78 | 746 | 24.2% | |
| 79 | 537 | 17.4% | |
| 80 | 191 | 6.2% | |
| 81 | 183 | 5.9% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 86 | 42 | 1.4% | |
| 85 | 63 | 2.0% | |
| 84 | 57 | 1.9% | |
| 83 | 45 | 1.5% | |
| 82 | 600 | 19.5% |
PATIENT_ID
Real number (ℝ≥0)
| Distinct | 952 |
|---|---|
| Distinct (%) | 30.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 327.1646639 |
|---|---|
| Minimum | 1 |
| Maximum | 997 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 24.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 18 |
| Q1 | 114 |
| median | 295 |
| Q3 | 496 |
| 95-th percentile | 812.1 |
| Maximum | 997 |
| Range | 996 |
| Interquartile range (IQR) | 382 |
Descriptive statistics
| Standard deviation | 244.9938241 |
|---|---|
| Coefficient of variation (CV) | 0.7488395025 |
| Kurtosis | -0.3835799867 |
| Mean | 327.1646639 |
| Median Absolute Deviation (MAD) | 190 |
| Skewness | 0.6233073065 |
| Sum | 1007340 |
| Variance | 60021.97385 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 21 | 9 | 0.3% | |
| 19 | 9 | 0.3% | |
| 6 | 9 | 0.3% | |
| 34 | 9 | 0.3% | |
| 8 | 9 | 0.3% | |
| 31 | 9 | 0.3% | |
| 16 | 9 | 0.3% | |
| 4 | 9 | 0.3% | |
| 5 | 9 | 0.3% | |
| 17 | 9 | 0.3% | |
| Other values (942) | 2989 | 97.1% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 9 | 0.3% | |
| 2 | 9 | 0.3% | |
| 3 | 9 | 0.3% | |
| 4 | 9 | 0.3% | |
| 5 | 9 | 0.3% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 997 | 1 | < 0.1% | |
| 995 | 1 | < 0.1% | |
| 994 | 1 | < 0.1% | |
| 993 | 1 | < 0.1% | |
| 992 | 1 | < 0.1% |
DTBLDR Date of blood draw
Categorical
| Distinct | 1064 |
|---|---|
| Distinct (%) | 34.6% |
| Missing | 7 |
| Missing (%) | 0.2% |
| Memory size | 24.1 KiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 7/17/08 | 40 | 1.3% | |
| 11/10/10 | 26 | 0.8% | |
| 10/23/12 | 25 | 0.8% | |
| 11/9/11 | 25 | 0.8% | |
| 10/24/12 | 23 | 0.7% | |
| 11/11/09 | 15 | 0.5% | |
| 2/28/07 | 12 | 0.4% | |
| 10/24/07 | 12 | 0.4% | |
| 3/28/07 | 11 | 0.4% | |
| 11/14/07 | 11 | 0.4% | |
| Other values (1054) | 2872 | 93.3% |
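The values above look like M/D/YY date strings, which is presumably why this variable was profiled as categorical rather than as a date. A minimal sketch of converting such strings with the standard library, assuming the M/D/YY reading is correct (`parse_draw_date` is a hypothetical helper, not part of the report):

```python
from datetime import datetime

def parse_draw_date(text):
    """Parse an M/D/YY string such as '7/17/08'; return None if it does not parse."""
    try:
        return datetime.strptime(text, "%m/%d/%y").date()
    except (ValueError, TypeError):
        # Non-date strings and non-string inputs are treated as missing.
        return None

# A few values copied from the frequency table, plus one unparseable entry.
samples = ["7/17/08", "11/10/10", "10/23/12", "nan"]
parsed = [parse_draw_date(s) for s in samples]
print(parsed)
```

Parsing the strings first would let a profiler treat the column as a date and report a range rather than string lengths.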
Unique
| Unique | 345 |
|---|---|
| Unique (%) | 11.2% |
Length
| Max length | 8 |
|---|---|
| Median length | 7 |
| Mean length | 6.991230919 |
| Min length | 3 |
DISEASE2 1=MS 2=NMO 3=ON 4=TM 5=ADEM 6=CIS blank = control
Real number (ℝ≥0)
| Distinct | 6 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 704 |
| Missing (%) | 22.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.598315789 |
|---|---|
| Minimum | 1 |
| Maximum | 6 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 24.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 4 |
| Maximum | 6 |
| Range | 5 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.244836558 |
|---|---|
| Coefficient of variation (CV) | 0.7788426829 |
| Kurtosis | 4.402739134 |
| Mean | 1.598315789 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.296742701 |
| Sum | 3796 |
| Variance | 1.549618055 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 1736 | 56.4% | |
| 2 | 341 | 11.1% | |
| 4 | 166 | 5.4% | |
| 6 | 86 | 2.8% | |
| 5 | 30 | 1.0% | |
| 3 | 16 | 0.5% | |
| (Missing) | 704 | 22.9% |
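The coded counts above can be decoded with the legend from the DISEASE2 column header (1=MS 2=NMO 3=ON 4=TM 5=ADEM 6=CIS, blank = control). This is a minimal sketch that assumes the six-code table above belongs to DISEASE2; the dictionary names are illustrative.

```python
# Legend taken from the DISEASE2 column header.
disease_labels = {1: "MS", 2: "NMO", 3: "ON", 4: "TM", 5: "ADEM", 6: "CIS"}

# Counts copied from the frequency table above.
code_counts = {1: 1736, 2: 341, 4: 166, 6: 86, 5: 30, 3: 16}
missing = 704  # blank entries; the legend reads blank as "control"

n_coded = sum(code_counts.values())
n_total = n_coded + missing

# Print diagnoses from most to least frequent, with share of all rows.
for code, count in sorted(code_counts.items(), key=lambda kv: -kv[1]):
    share = 100 * count / n_total
    print(f"{disease_labels[code]:>4}: {count:5d} ({share:.1f}%)")
```

The coded total, 2375, matches the number of rows with DIAGDIS2 = 2 (Case) reported in the DIAGDIS2 frequency table.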
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 1736 | 56.4% | |
| 2 | 341 | 11.1% | |
| 3 | 16 | 0.5% | |
| 4 | 166 | 5.4% | |
| 5 | 30 | 1.0% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 6 | 86 | 2.8% | |
| 5 | 30 | 1.0% | |
| 4 | 166 | 5.4% | |
| 3 | 16 | 0.5% | |
| 2 | 341 | 11.1% |
DIAGDIS2 1=Control 2=Case
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 3 |
| Missing (%) | 0.1% |
| Memory size | 24.1 KiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2 | 2375 | 77.1% | |
| 1 | 701 | 22.8% | |
| (Missing) | 3 | 0.1% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
VISIT_ID
Boolean
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 24.1 KiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 3079 | 100.0% |
VISIT_ID.1
Boolean
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 24.1 KiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 3079 | 100.0% |
ROW_NO
Boolean
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 24.1 KiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 3079 | 100.0% |
ETHYOU - Participant ethnicity 1=Hispanic or Latino 2=Non Hispanic or Latino 3=Don't know
Categorical
MISSING
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 39 |
| Missing (%) | 1.3% |
| Memory size | 24.1 KiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2 | 2864 | 93.0% | |
| 1 | 130 | 4.2% | |
| 3 | 46 | 1.5% | |
| (Missing) | 39 | 1.3% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
ETHFATH - Father's ethnicity 1=Hispanic or Latino 2=Non Hispanic or Latino 3=Don't know
Categorical
MISSING
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 82 |
| Missing (%) | 2.7% |
| Memory size | 24.1 KiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2 | 2824 | 91.7% | |
| 1 | 112 | 3.6% | |
| 3 | 61 | 2.0% | |
| (Missing) | 82 | 2.7% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
ETHMOTH - Mother's ethnicity 1=Hispanic or Latino 2=Non Hispanic or Latino 3=Don't know
Categorical
MISSING
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 81 |
| Missing (%) | 2.6% |
| Memory size | 24.1 KiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2 | 2834 | 92.0% | |
| 1 | 110 | 3.6% | |
| 3 | 54 | 1.8% | |
| (Missing) | 81 | 2.6% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
ETHFF - Father's Father's ethnicity 1=Hispanic or Latino 2=Non Hispanic or Latino 3=Don't know
Categorical
HIGH CORRELATION, HIGH CORRELATION, MISSING
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 86 |
| Missing (%) | 2.8% |
| Memory size | 24.1 KiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2 | 2797 | 90.8% | |
| 1 | 110 | 3.6% | |
| 3 | 86 | 2.8% | |
| (Missing) | 86 | 2.8% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
ETHFM - Father's Mother's ethnicity 1=Hispanic or Latino 2=Non Hispanic or Latino 3=Don't know
Categorical
HIGH CORRELATION, HIGH CORRELATION, MISSING
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 86 |
| Missing (%) | 2.8% |
| Memory size | 24.1 KiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2 | 2812 | 91.3% | |
| 1 | 99 | 3.2% | |
| 3 | 82 | 2.7% | |
| (Missing) | 86 | 2.8% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
ETHMF - Mother's Father's ethnicity 1=Hispanic or Latino 2=Non Hispanic or Latino 3=Don't know
Categorical
HIGH CORRELATION, MISSING
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 86 |
| Missing (%) | 2.8% |
| Memory size | 24.1 KiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2 | 2813 | 91.4% | |
| 1 | 99 | 3.2% | |
| 3 | 81 | 2.6% | |
| (Missing) | 86 | 2.8% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
ETHMM - Mother's Mother's ethnicity 1=Hispanic or Latino 2=Non Hispanic or Latino 3=Don't know
Categorical
HIGH CORRELATION, MISSING
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 85 |
| Missing (%) | 2.8% |
| Memory size | 24.1 KiB |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2 | 2817 | 91.5% | |
| 1 | 104 | 3.4% | |
| 3 | 73 | 2.4% | |
| (Missing) | 85 | 2.8% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
RACYOU - Participant race 1=American Indian or Alaska native 2=Middle Eastern 3=South Asian 4=Other Asian 5=Black or African American 6=Native Hawaiian or other Pacific Islander 7=White 8=Don't know
Real number (ℝ≥0)
| Distinct | 8 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 10 |
| Missing (%) | 0.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.728901922 |
|---|---|
| Minimum | 1 |
| Maximum | 8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 24.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 7 |
| median | 7 |
| Q3 | 7 |
| 95-th percentile | 7 |
| Maximum | 8 |
| Range | 7 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.856459161 |
|---|---|
| Coefficient of variation (CV) | 0.1272806724 |
| Kurtosis | 14.52398326 |
| Mean | 6.728901922 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -3.521296235 |
| Sum | 20651 |
| Variance | 0.7335222945 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 7 | 2698 | 87.6% | |
| 5 | 272 | 8.8% | |
| 4 | 30 | 1.0% | |
| 8 | 21 | 0.7% | |
| 2 | 16 | 0.5% | |
| 3 | 14 | 0.5% | |
| 1 | 13 | 0.4% | |
| 6 | 5 | 0.2% | |
| (Missing) | 10 | 0.3% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 13 | 0.4% | |
| 2 | 16 | 0.5% | |
| 3 | 14 | 0.5% | |
| 4 | 30 | 1.0% | |
| 5 | 272 | 8.8% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 8 | 21 | 0.7% | |
| 7 | 2698 | 87.6% | |
| 6 | 5 | 0.2% | |
| 5 | 272 | 8.8% | |
| 4 | 30 | 1.0% |
RACFATH - Father's race 1=American Indian or Alaska native 2=Middle Eastern 3=South Asian 4=Other Asian 5=Black or African American 6=Native Hawaiian or other Pacific Islander 7=White 8=Don't know
Real number (ℝ≥0)
MISSING
| Distinct | 8 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 80 |
| Missing (%) | 2.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.718572858 |
|---|---|
| Minimum | 1 |
| Maximum | 8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 24.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 7 |
| median | 7 |
| Q3 | 7 |
| 95-th percentile | 7 |
| Maximum | 8 |
| Range | 7 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.9501370099 |
|---|---|
| Coefficient of variation (CV) | 0.1414194696 |
| Kurtosis | 14.88766626 |
| Mean | 6.718572858 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -3.597205925 |
| Sum | 20149 |
| Variance | 0.9027603375 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 7 | 2607 | 84.7% | |
| 5 | 249 | 8.1% | |
| 8 | 51 | 1.7% | |
| 4 | 27 | 0.9% | |
| 1 | 25 | 0.8% | |
| 2 | 21 | 0.7% | |
| 3 | 14 | 0.5% | |
| 6 | 5 | 0.2% | |
| (Missing) | 80 | 2.6% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 25 | 0.8% | |
| 2 | 21 | 0.7% | |
| 3 | 14 | 0.5% | |
| 4 | 27 | 0.9% | |
| 5 | 249 | 8.1% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 8 | 51 | 1.7% | |
| 7 | 2607 | 84.7% | |
| 6 | 5 | 0.2% | |
| 5 | 249 | 8.1% | |
| 4 | 27 | 0.9% |
RACMOTH - Mother's race 1=American Indian or Alaska native 2=Middle Eastern 3=South Asian 4=Other Asian 5=Black or African American 6=Native Hawaiian or other Pacific Islander 7=White 8=Don't know
Real number (ℝ≥0)
MISSING
| Distinct | 8 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 79 |
| Missing (%) | 2.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.740666667 |
|---|---|
| Minimum | 1 |
| Maximum | 8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 24.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 7 |
| median | 7 |
| Q3 | 7 |
| 95-th percentile | 7 |
| Maximum | 8 |
| Range | 7 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.873880681 |
|---|---|
| Coefficient of variation (CV) | 0.1296430641 |
| Kurtosis | 14.91233829 |
| Mean | 6.740666667 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -3.554826906 |
| Sum | 20222 |
| Variance | 0.7636674447 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 7 | 2627 | 85.3% | |
| 5 | 251 | 8.2% | |
| 8 | 44 | 1.4% | |
| 4 | 30 | 1.0% | |
| 2 | 19 | 0.6% | |
| 1 | 14 | 0.5% | |
| 3 | 12 | 0.4% | |
| 6 | 3 | 0.1% | |
| (Missing) | 79 | 2.6% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 14 | 0.5% | |
| 2 | 19 | 0.6% | |
| 3 | 12 | 0.4% | |
| 4 | 30 | 1.0% | |
| 5 | 251 | 8.2% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 8 | 44 | 1.4% | |
| 7 | 2627 | 85.3% | |
| 6 | 3 | 0.1% | |
| 5 | 251 | 8.2% | |
| 4 | 30 | 1.0% |
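RACFF - Father's Father's race 1=American Indian or Alaska native 2=Middle Eastern 3=South Asian 4=Other Asian 5=Black or African American 6=Native Hawaiian or other Pacific Islander 7=White 8=Don't know
Real number (ℝ≥0)
MISSING
| Distinct | 8 |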
|---|---|
| Distinct (%) | 0.3% |
| Missing | 84 |
| Missing (%) | 2.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.733222037 |
|---|---|
| Minimum | 1 |
| Maximum | 8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 24.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 7 |
| median | 7 |
| Q3 | 7 |
| 95-th percentile | 7 |
| Maximum | 8 |
| Range | 7 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.9742586047 |
|---|---|
| Coefficient of variation (CV) | 0.1446942637 |
| Kurtosis | 15.02698061 |
| Mean | 6.733222037 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -3.594606642 |
| Sum | 20166 |
| Variance | 0.9491798288 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 7 | 2577 | 83.7% | |
| 5 | 235 | 7.6% | |
| 8 | 88 | 2.9% | |
| 1 | 30 | 1.0% | |
| 4 | 27 | 0.9% | |
| 2 | 19 | 0.6% | |
| 3 | 14 | 0.5% | |
| 6 | 5 | 0.2% | |
| (Missing) | 84 | 2.7% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 30 | 1.0% | |
| 2 | 19 | 0.6% | |
| 3 | 14 | 0.5% | |
| 4 | 27 | 0.9% | |
| 5 | 235 | 7.6% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 8 | 88 | 2.9% | |
| 7 | 2577 | 83.7% | |
| 6 | 5 | 0.2% | |
| 5 | 235 | 7.6% | |
| 4 | 27 | 0.9% |
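RACFM - Father's Mother's race 1=American Indian or Alaska native 2=Middle Eastern 3=South Asian 4=Other Asian 5=Black or African American 6=Native Hawaiian or other Pacific Islander 7=White 8=Don't know
Real number (ℝ≥0)
MISSING
| Distinct | 8 |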
|---|---|
| Distinct (%) | 0.3% |
| Missing | 84 |
| Missing (%) | 2.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.699833055 |
|---|---|
| Minimum | 1 |
| Maximum | 8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 24.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 7 |
| median | 7 |
| Q3 | 7 |
| 95-th percentile | 7 |
| Maximum | 8 |
| Range | 7 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1.062689319 |
|---|---|
| Coefficient of variation (CV) | 0.1586142984 |
| Kurtosis | 13.87025424 |
| Mean | 6.699833055 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -3.53391468 |
| Sum | 20066 |
| Variance | 1.129308589 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 7 | 2561 | 83.2% | |
| 5 | 237 | 7.7% | |
| 8 | 88 | 2.9% | |
| 1 | 45 | 1.5% | |
| 4 | 26 | 0.8% | |
| 2 | 22 | 0.7% | |
| 3 | 13 | 0.4% | |
| 6 | 3 | 0.1% | |
| (Missing) | 84 | 2.7% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 45 | 1.5% | |
| 2 | 22 | 0.7% | |
| 3 | 13 | 0.4% | |
| 4 | 26 | 0.8% | |
| 5 | 237 | 7.7% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 8 | 88 | 2.9% | |
| 7 | 2561 | 83.2% | |
| 6 | 3 | 0.1% | |
| 5 | 237 | 7.7% | |
| 4 | 26 | 0.8% |
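RACMF - Mother's Father's race 1=American Indian or Alaska native 2=Middle Eastern 3=South Asian 4=Other Asian 5=Black or African American 6=Native Hawaiian or other Pacific Islander 7=White 8=Don't know
Real number (ℝ≥0)
MISSING
| Distinct | 8 |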
|---|---|
| Distinct (%) | 0.3% |
| Missing | 84 |
| Missing (%) | 2.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.749248748 |
|---|---|
| Minimum | 1 |
| Maximum | 8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 24.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 7 |
| median | 7 |
| Q3 | 7 |
| 95-th percentile | 7 |
| Maximum | 8 |
| Range | 7 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.9080442493 |
|---|---|
| Coefficient of variation (CV) | 0.1345400478 |
| Kurtosis | 14.48166349 |
| Mean | 6.749248748 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -3.508068678 |
| Sum | 20214 |
| Variance | 0.8245443586 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 7 | 2596 | 84.3% | |
| 5 | 233 | 7.6% | |
| 8 | 78 | 2.5% | |
| 4 | 31 | 1.0% | |
| 2 | 24 | 0.8% | |
| 1 | 15 | 0.5% | |
| 3 | 14 | 0.5% | |
| 6 | 4 | 0.1% | |
| (Missing) | 84 | 2.7% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 15 | 0.5% | |
| 2 | 24 | 0.8% | |
| 3 | 14 | 0.5% | |
| 4 | 31 | 1.0% | |
| 5 | 233 | 7.6% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 8 | 78 | 2.5% | |
| 7 | 2596 | 84.3% | |
| 6 | 4 | 0.1% | |
| 5 | 233 | 7.6% | |
| 4 | 31 | 1.0% |
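RACMM - Mother's Mother's race 1=American Indian or Alaska native 2=Middle Eastern 3=South Asian 4=Other Asian 5=Black or African American 6=Native Hawaiian or other Pacific Islander 7=White 8=Don't know
Real number (ℝ≥0)
MISSING
| Distinct | 8 |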
|---|---|
| Distinct (%) | 0.3% |
| Missing | 85 |
| Missing (%) | 2.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.739478958 |
|---|---|
| Minimum | 1 |
| Maximum | 8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 24.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 7 |
| median | 7 |
| Q3 | 7 |
| 95-th percentile | 7 |
| Maximum | 8 |
| Range | 7 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.9375404813 |
|---|---|
| Coefficient of variation (CV) | 0.1391117158 |
| Kurtosis | 15.72619682 |
| Mean | 6.739478958 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -3.652558984 |
| Sum | 20178 |
| Variance | 0.8789821541 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 7 | 2596 | 84.3% | |
| 5 | 240 | 7.8% | |
| 8 | 73 | 2.4% | |
| 4 | 28 | 0.9% | |
| 1 | 26 | 0.8% | |
| 2 | 18 | 0.6% | |
| 3 | 10 | 0.3% | |
| 6 | 3 | 0.1% | |
| (Missing) | 85 | 2.8% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 26 | 0.8% | |
| 2 | 18 | 0.6% | |
| 3 | 10 | 0.3% | |
| 4 | 28 | 0.9% | |
| 5 | 240 | 7.8% |
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 8 | 73 | 2.4% | |
| 7 | 2596 | 84.3% | |
| 6 | 3 | 0.1% | |
| 5 | 240 | 7.8% | |
| 4 | 28 | 0.9% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
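The calculation described above can be sketched in plain Python. This is an illustrative implementation under the stated definition, not the profiler's own code; `pearson_r` is a hypothetical helper name.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/(n - 1) factors in the covariance and the standard deviations cancel,
    # so sums of products and sums of squares are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
print(pearson_r(x, y))
# Shifting and rescaling x (with a positive factor) leaves r unchanged.
print(pearson_r([3 * a + 5 for a in x], y))
```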
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
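A minimal sketch of that calculation: rank both variables, then apply the Pearson formula to the ranks. Assigning tied values the average of their ranks is an assumption made here; the helper names are hypothetical.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# A nonlinear but strictly increasing relationship still gives rho = 1.
print(spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]))
```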
Kendall's τ
Similar to Spearman's rank correlation coefficient, Kendall's rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
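The pair-counting definition above can be sketched directly. This version is the plain form often called tau-a; libraries commonly report a tie-adjusted variant (tau-b), so results may differ when the data contain ties. `kendall_tau_a` is a hypothetical helper name.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # Pairs tied in x or y count as neither.
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

print(kendall_tau_a([1, 2, 3], [1, 3, 2]))  # 2 concordant, 1 discordant, 3 pairs
```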
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation for Phik is available.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | BARCODE2 | BARCODE2_V2 | SITE_ID | PATIENT_ID | DTBLDR Date of blood draw | DISEASE2 1=MS 2=NMO 3=ON 4=TM 5=ADEM 6=CIS blank = control | DIAGDIS2 1=Control 2=Case | VISIT_ID | VISIT_ID.1 | ROW_NO | ETHYOU - Participant ethnicity 1=Hispanic or Latino 2=Non Hispanic or Latino 3=Don't know | ETHFATH - Father's ethnicity 1=Hispanic or Latino 2=Non Hispanic or Latino 3=Don't know | ETHMOTH - Mother's ethnicity 1=Hispanic or Latino 2=Non Hispanic or Latino 3=Don't know | ETHFF - Father's Father's ethnicity 1=Hispanic or Latino 2=Non Hispanic or Latino 3=Don't know | ETHFM - Father's Mother's ethnicity 1=Hispanic or Latino 2=Non Hispanic or Latino 3=Don't know | ETHMF - Mother's Father's ethnicity 1=Hispanic or Latino 2=Non Hispanic or Latino 3=Don't know | ETHMM - Mother's Mother's ethnicity 1=Hispanic or Latino 2=Non Hispanic or Latino 3=Don't know | RACYOU - Participant race 1=American Indian or Alaska native 2=Middle Eastern 3=South Asian 4=Other Asian 5=Black or African American 6=Native Hawaiian or other Pacific Islander 7=White 8=Don't know | RACFATH - Father's race 1=American Indian or Alaska native 2=Middle Eastern 3=South Asian 4=Other Asian 5=Black or African American 6=Native Hawaiian or other Pacific Islander 7=White 8=Don't know | RACMOTH - Mother's race 1=American Indian or Alaska native 2=Middle Eastern 3=South Asian 4=Other Asian 5=Black or African American 6=Native Hawaiian or other Pacific Islander 7=White 8=Don't know | RACFF - Father's Father's race 1=American Indian or Alaska native 2=Middle Eastern 3=South Asian 4=Other Asian 5=Black or African American 6=Native Hawaiian or other Pacific Islander 7=White 8=Don't know | RACFM - Father's Mother's race 1=American Indian or Alaska native 2=Middle Eastern 3=South Asian 4=Other Asian 5=Black or African American 6=Native Hawaiian or other Pacific Islander 7=White 8=Don't know | RACMF - Mother's Father's race 1=American Indian or Alaska native 2=Middle Eastern 3=South Asian 4=Other Asian 5=Black or African American 6=Native Hawaiian or other Pacific Islander 7=White 8=Don't know | RACMM - Mother's Mother's race 1=American Indian or Alaska native 2=Middle Eastern 3=South Asian 4=Other Asian 5=Black or African American 6=Native Hawaiian or other Pacific Islander 7=White 8=Don't know |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AC000003 | AL100611 | 81 | 21 | 4/9/07 | 1.0 | 2.0 | 0 | 0 | 0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 |
| 1 | AC000005 | AL100388 | 1 | 22 | 10/18/06 | 1.0 | 2.0 | 0 | 0 | 0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 |
| 2 | AC000006 | NaN | 1 | 182 | 10/23/07 | NaN | 1.0 | 0 | 0 | 0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 |
| 3 | AC000007 | NaN | 80 | 70 | 3/17/08 | NaN | 1.0 | 0 | 0 | 0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 |
| 4 | AC000009 | NaN | 78 | 325 | 4/4/07 | 4.0 | 2.0 | 0 | 0 | 0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 |
| 5 | AC000010 | NaN | 1 | 25 | 10/24/06 | NaN | 1.0 | 0 | 0 | 0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 |
| 6 | AC000011 | NaN | 79 | 73 | 12/14/06 | 1.0 | 2.0 | 0 | 0 | 0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 |
| 7 | AC000012 | NaN | 81 | 56 | 11/28/07 | NaN | 1.0 | 0 | 0 | 0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 |
| 8 | AC000013 | NaN | 78 | 226 | 9/20/06 | 1.0 | 2.0 | 0 | 0 | 0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 |
| 9 | AC000014 | AL100162 | 1 | 42 | 11/21/06 | 1.0 | 2.0 | 0 | 0 | 0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 |
Last rows
| | BARCODE2 | BARCODE2_V2 | SITE_ID | PATIENT_ID | DTBLDR Date of blood draw | DISEASE2 1=MS 2=NMO 3=ON 4=TM 5=ADEM 6=CIS blank = control | DIAGDIS2 1=Control 2=Case | VISIT_ID | VISIT_ID.1 | ROW_NO | ETHYOU - Participant ethnicity 1=Hispanic or Latino 2=Non Hispanic or Latino 3=Don't know | ETHFATH - Father's ethnicity 1=Hispanic or Latino 2=Non Hispanic or Latino 3=Don't know | ETHMOTH - Mother's ethnicity 1=Hispanic or Latino 2=Non Hispanic or Latino 3=Don't know | ETHFF - Father's Father's ethnicity 1=Hispanic or Latino 2=Non Hispanic or Latino 3=Don't know | ETHFM - Father's Mother's ethnicity 1=Hispanic or Latino 2=Non Hispanic or Latino 3=Don't know | ETHMF - Mother's Father's ethnicity 1=Hispanic or Latino 2=Non Hispanic or Latino 3=Don't know | ETHMM - Mother's Mother's ethnicity 1=Hispanic or Latino 2=Non Hispanic or Latino 3=Don't know | RACYOU - Participant race 1=American Indian or Alaska native 2=Middle Eastern 3=South Asian 4=Other Asian 5=Black or African American 6=Native Hawaiian or other Pacific Islander 7=White 8=Don't know | RACFATH - Father's race 1=American Indian or Alaska native 2=Middle Eastern 3=South Asian 4=Other Asian 5=Black or African American 6=Native Hawaiian or other Pacific Islander 7=White 8=Don't know | RACMOTH - Mother's race 1=American Indian or Alaska native 2=Middle Eastern 3=South Asian 4=Other Asian 5=Black or African American 6=Native Hawaiian or other Pacific Islander 7=White 8=Don't know | RACFF - Father's Father's race 1=American Indian or Alaska native 2=Middle Eastern 3=South Asian 4=Other Asian 5=Black or African American 6=Native Hawaiian or other Pacific Islander 7=White 8=Don't know | RACFM - Father's Mother's race 1=American Indian or Alaska native 2=Middle Eastern 3=South Asian 4=Other Asian 5=Black or African American 6=Native Hawaiian or other Pacific Islander 7=White 8=Don't know | RACMF - Mother's Father's race 1=American Indian or Alaska native 2=Middle Eastern 3=South Asian 4=Other Asian 5=Black or African American 6=Native Hawaiian or other Pacific Islander 7=White 8=Don't know | RACMM - Mother's Mother's race 1=American Indian or Alaska native 2=Middle Eastern 3=South Asian 4=Other Asian 5=Black or African American 6=Native Hawaiian or other Pacific Islander 7=White 8=Don't know |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3069 | PC001032 | NaN | 78 | 926 | 6/15/11 | 1.0 | 2.0 | 0 | 0 | 0 | 3.0 | 3.0 | 3.0 | 3.0 | 3.0 | 3.0 | 3.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 |
| 3070 | PC001037 | NaN | 82 | 398 | 7/11/11 | 2.0 | 2.0 | 0 | 0 | 0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 |
| 3071 | PC001038 | NaN | 78 | 953 | 2/13/12 | 2.0 | 2.0 | 0 | 0 | 0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 8.0 | 8.0 | 8.0 | 8.0 | 8.0 | 8.0 | 8.0 |
| 3072 | PC001042 | NaN | 82 | 433 | 11/9/11 | 2.0 | 2.0 | 0 | 0 | 0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 |
| 3073 | PC001058 | NaN | 78 | 958 | 4/2/12 | 2.0 | 2.0 | 0 | 0 | 0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 |
| 3074 | PC001059 | NaN | 82 | 552 | 10/23/12 | NaN | 1.0 | 0 | 0 | 0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 |
| 3075 | PC001064 | NaN | 82 | 550 | 10/23/12 | NaN | 1.0 | 0 | 0 | 0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 |
| 3076 | PC001065 | NaN | 82 | 558 | 10/23/12 | NaN | 1.0 | 0 | 0 | 0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 |
| 3077 | PC001069 | NaN | 1 | 613 | 12/12/12 | NaN | 1.0 | 0 | 0 | 0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 |
| 3078 | PC001070 | NaN | 82 | 644 | 10/23/12 | NaN | 1.0 | 0 | 0 | 0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 2.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 |
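Returning to the bias-corrected Cramér's V described in the correlation notes above: the sketch below applies Bergsma's (2013) correction to a contingency table of observed counts. It is an illustration of the published formula, not the profiler's own implementation; `cramers_v_corrected` is a hypothetical helper and the example tables are made up.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for an r x k table of observed counts."""
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]
    n = sum(row_totals)

    # Pearson chi-squared statistic (no continuity correction).
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bergsma's correction: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    k_corr = k - (k - 1) ** 2 / (n - 1)
    r_corr = r - (r - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0
    return math.sqrt(phi2_corr / denom)

print(cramers_v_corrected([[10, 20], [20, 40]]))  # independent rows
print(cramers_v_corrected([[50, 0], [0, 50]]))    # perfectly associated
```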